Data Compression

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Databases and Data Structures".

Deadline for manuscript submissions: closed (30 September 2009) | Viewed by 68781

Special Issue Editor


Dr. David Salomon
Guest Editor
Computer Science Department (Retired), California State University, Northridge, CA 91330-8281, USA
Interests: computer graphics; data compression; cryptography

Special Issue Information

Dear Colleagues,

Data compression is the operation of converting an input data file to a smaller file. This operation is important for two reasons. First, people like to accumulate data, so no matter how big a storage device one has, sooner or later it will fill up. Second, people hate to wait for data transfers: we often upload and download files, and long, slow transfers are frustrating.

How can data be compressed? We can represent the same amount of information in fewer bits because the original data representation is not the shortest possible; it is intentionally long in order to simplify processing the data. We say that our data representations have redundancies. Data is compressed by locating these redundancies and reducing or eliminating them. The field of data compression therefore tries to understand the sources of redundancy in different types of data and to find clever methods to eliminate them. Today, after decades of research, there are hundreds of algorithms and dozens of implementations that can reduce the size of all types of digital data. It is my hope that this issue of Algorithms will make a significant contribution to the field.
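
To make the idea of redundancy removal concrete, here is a minimal run-length-encoding sketch in Python. It is a toy illustration, not drawn from any paper in this issue: a string with long runs of repeated symbols is an obvious redundancy, and recording each run as a (symbol, count) pair removes it.

    # Minimal run-length encoder/decoder illustrating redundancy removal
    # (a toy example, not taken from any paper in this issue).
    def rle_encode(text):
        """Collapse each run of a repeated symbol into a (symbol, count) pair."""
        runs = []
        i = 0
        while i < len(text):
            j = i
            while j < len(text) and text[j] == text[i]:
                j += 1
            runs.append((text[i], j - i))
            i = j
        return runs

    def rle_decode(runs):
        """Rebuild the original string from the (symbol, count) pairs."""
        return "".join(symbol * count for symbol, count in runs)

    data = "aaaaaaaabbbbcccccccc"
    encoded = rle_encode(data)
    assert rle_decode(encoded) == data
    print(encoded)   # [('a', 8), ('b', 4), ('c', 8)] -- 3 pairs instead of 20 symbols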

Dr. David Salomon
Guest Editor

Keywords

  • data compression
  • data coding
  • source coding
  • information theory
  • entropy
  • data redundancy
  • variable-length codes

Published Papers (6 papers)


Research

Article
Suffix-Sorting via Shannon-Fano-Elias Codes
by Donald Adjeroh and Fei Nan
Algorithms 2010, 3(2), 145-167; https://doi.org/10.3390/a3020145 - 01 Apr 2010
Cited by 12 | Viewed by 11562
Abstract
Given a sequence T = t0 t1 ... tn-1 of size n = |T|, with symbols from a fixed alphabet Σ (|Σ| ≤ n), the suffix array provides a listing of all the suffixes of T in lexicographic order. Given T, the suffix sorting problem is to construct its suffix array. The direct suffix sorting problem is to construct the suffix array of T directly, without using the suffix tree data structure. While algorithms for linear-time, linear-space direct suffix sorting have been proposed, the actual constant in the linear space is still a major concern, given that the applications of suffix trees and suffix arrays (such as whole-genome analysis) often involve huge data sets. In this work, we reduce the gap between current results and the minimal space requirement. We introduce an algorithm for the direct suffix sorting problem with worst-case time complexity in O(n), requiring only (1⅔ n log n − n log |Σ| + O(1)) bits of memory. This implies 5⅔ n + O(1) bytes of total space (including space for both the output suffix array and the input sequence T), assuming n ≤ 2^32, |Σ| ≤ 256, and 4 bytes per integer. The basis of our algorithm is an extension of the Shannon-Fano-Elias codes used in source coding and information theory. This is the first time information-theoretic methods have been used as the basis for solving the suffix sorting problem.
(This article belongs to the Special Issue Data Compression)
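
For readers unfamiliar with the object being built, the following Python sketch constructs a suffix array by naive sorting. It only illustrates the definition and runs in roughly O(n^2 log n) time, far from the O(n) time and small constant-factor space of the paper's algorithm.

    # Naive suffix-array construction, for illustration only; the paper's
    # algorithm achieves O(n) time with a small memory constant.
    def naive_suffix_array(text):
        """Starting positions of the suffixes of `text`, in lexicographic order."""
        return sorted(range(len(text)), key=lambda i: text[i:])

    T = "banana"
    sa = naive_suffix_array(T)
    print(sa)                    # [5, 3, 1, 0, 4, 2]
    print([T[i:] for i in sa])   # ['a', 'ana', 'anana', 'banana', 'na', 'nana']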

Article
Interactive Compression of Digital Data
by Bruno Carpentieri
Algorithms 2010, 3(1), 63-75; https://doi.org/10.3390/a3010063 - 29 Jan 2010
Cited by 4 | Viewed by 11143
Abstract
If we can use previous knowledge of the source (or knowledge of a source that is correlated to the one we want to compress) to aid the compression process, then we can obtain significant gains in compression. In the fundamental source coding theorem this amounts to substituting conditional entropy for entropy, which gives a new theoretical limit that allows for better compression. When data compression is used for data transmission, we can achieve this by assuming some degree of interaction between the compressor and the decompressor, which allows more efficient use of the previous knowledge they both have of the source. In this paper we review previous work that applies interactive approaches to data compression and discuss this possibility.
(This article belongs to the Special Issue Data Compression)
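
As a worked illustration of the limit the abstract refers to, the Python sketch below computes the entropy H(X) and the conditional entropy H(X|Y) for a small, made-up joint distribution; side information Y lowers the bound, H(X|Y) ≤ H(X).

    # Entropy vs. conditional entropy for a toy joint distribution p(x, y);
    # the numbers are made up, chosen only so that H(X|Y) < H(X).
    from math import log2

    p = {("a", 0): 0.4, ("b", 0): 0.1,   # p[(x, y)] = joint probability
         ("a", 1): 0.1, ("b", 1): 0.4}

    px, py = {}, {}
    for (x, y), pr in p.items():
        px[x] = px.get(x, 0.0) + pr      # marginal p(x)
        py[y] = py.get(y, 0.0) + pr      # marginal p(y)

    H_X = -sum(pr * log2(pr) for pr in px.values())
    # H(X|Y) = -sum over (x, y) of p(x, y) * log2 p(x|y), with p(x|y) = p(x, y) / p(y)
    H_X_given_Y = -sum(pr * log2(pr / py[y]) for (x, y), pr in p.items())

    print(f"H(X)   = {H_X:.3f} bits")       # 1.000
    print(f"H(X|Y) = {H_X_given_Y:.3f} bits")  # ~0.722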

Article
Linear-Time Text Compression by Longest-First Substitution
by Ryosuke Nakamura, Shunsuke Inenaga, Hideo Bannai, Takashi Funamoto, Masayuki Takeda and Ayumi Shinohara
Algorithms 2009, 2(4), 1429-1448; https://doi.org/10.3390/a2041429 - 25 Nov 2009
Cited by 16 | Viewed by 9680
Abstract
We consider grammar-based text compression with longest-first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.
(This article belongs to the Special Issue Data Compression)
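
To make the LFS rule concrete, the following Python sketch applies it in the obvious (non-linear-time) way: it repeatedly finds a longest factor with at least two non-overlapping occurrences and replaces those occurrences with a fresh non-terminal. It is not the paper's suffix-tree-based algorithm, only an illustration of the substitution rule.

    # Naive longest-first substitution (LFS) grammar compression, for illustration.
    # The paper gives a linear-time algorithm using sparse lazy suffix trees;
    # this sketch applies the LFS rule directly, in roughly cubic time.
    def longest_repeating_factor(tokens):
        """Longest factor with >= 2 non-overlapping occurrences, or None."""
        n = len(tokens)
        for length in range(n // 2, 1, -1):          # longest candidates first
            seen = {}
            for i in range(n - length + 1):
                key = tuple(tokens[i:i + length])
                first = seen.setdefault(key, i)
                if i >= first + length:              # non-overlapping second occurrence
                    return key
        return None

    def lfs_compress(text):
        tokens = list(text)
        rules = {}                                   # non-terminal -> replaced factor
        while True:
            factor = longest_repeating_factor(tokens)
            if factor is None:
                break
            nt = f"A{len(rules) + 1}"                # fresh non-terminal symbol
            rules[nt] = factor
            out, i = [], 0
            while i < len(tokens):
                if tuple(tokens[i:i + len(factor)]) == factor:
                    out.append(nt)
                    i += len(factor)                 # greedy, left-to-right, non-overlapping
                else:
                    out.append(tokens[i])
                    i += 1
            tokens = out
        return tokens, rules

    start, rules = lfs_compress("abababcabababc")
    print(start)   # ['A1', 'A1']
    print(rules)   # {'A1': ('a', 'b', 'a', 'b', 'a', 'b', 'c')}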

Article
Multiplication Symmetric Convolution Property for Discrete Trigonometric Transforms
by Do Nyeon Kim and K. R. Rao
Algorithms 2009, 2(3), 1221-1231; https://doi.org/10.3390/a2031221 - 22 Sep 2009
Viewed by 8392
Abstract
The symmetric-convolution multiplication (SCM) property of discrete trigonometric transforms (DTTs) based on unitary transform matrices is developed. Then, as the counterpart of this property, the novel multiplication symmetric-convolution (MSC) property of discrete trigonometric transforms is developed.
(This article belongs to the Special Issue Data Compression)
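
The abstract does not restate the property itself, so the Python sketch below only shows the classical convolution-multiplication property of the DFT, of which the SCM/MSC properties for DTTs are symmetric-convolution analogues: pointwise multiplication in the transform domain corresponds to (circular) convolution in the signal domain. It is not the paper's DTT derivation.

    # Circular convolution via pointwise multiplication in the DFT domain.
    # Classical analogue of the symmetric-convolution properties studied
    # for DTTs in the paper; shown here only as an illustration.
    import numpy as np

    def circular_convolution(x, h):
        """Direct O(N^2) circular convolution of two equal-length sequences."""
        n = len(x)
        return np.array([sum(x[m] * h[(k - m) % n] for m in range(n)) for k in range(n)])

    rng = np.random.default_rng(0)
    x = rng.standard_normal(8)
    h = rng.standard_normal(8)

    direct = circular_convolution(x, h)
    via_dft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real

    print(np.allclose(direct, via_dft))   # True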

Article
Approximate String Matching with Compressed Indexes
by Luís M. S. Russo, Gonzalo Navarro, Arlindo L. Oliveira and Pedro Morales
Algorithms 2009, 2(3), 1105-1136; https://doi.org/10.3390/a2031105 - 10 Sep 2009
Cited by 26 | Viewed by 11955
Abstract
A compressed full-text self-index for a text T is a data structure requiring reduced space and able to search for patterns P in T. It can also reproduce any substring of T, thus actually replacing T. Despite the recent explosion of interest in compressed indexes, there has not been much progress on functionalities beyond the basic exact search. In this paper we focus on indexed approximate string matching (ASM), which is of great interest, say, in bioinformatics. We study ASM algorithms for Lempel-Ziv compressed indexes and for compressed suffix trees/arrays. Most compressed self-indexes belong to one of these classes. We start by adapting the classical method of partitioning into exact search to self-indexes, and optimize it over a representative of either class of self-index. Then, we show that a Lempel-Ziv index can be seen as an extension of the classical q-samples index. We give new insights on this type of index, which can be of independent interest, and then apply them to a Lempel-Ziv index. Finally, we improve hierarchical verification, a successful technique for sequential searching, so as to extend the matches of pattern pieces to the left or right. Most compressed suffix trees/arrays support the required bidirectionality, thus enabling the implementation of the improved technique. In turn, the improved verification largely reduces the accesses to the text, which are expensive in self-indexes. We show experimentally that our algorithms are competitive and provide useful space-time tradeoffs compared to classical indexes.
(This article belongs to the Special Issue Data Compression)
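
As a rough sketch of the "partitioning into exact search" idea that the paper adapts to self-indexes: a pattern P searched with at most k errors is split into k+1 pieces, any approximate occurrence must contain at least one piece exactly (pigeonhole), and the exact piece hits are then verified with an edit-distance computation. The Python sketch below runs this over a plain string rather than a compressed index; all names are illustrative.

    # Partitioning into exact search for approximate string matching (ASM),
    # sketched over a plain Python string; the paper runs the same idea on
    # compressed self-indexes (Lempel-Ziv indexes, compressed suffix arrays).
    def occurs_within_k(window, pattern, k):
        """True if some substring of `window` is within edit distance k of `pattern`."""
        # Semi-global DP: free start (first row all zeros), free end (min of last row).
        prev = [0] * (len(window) + 1)
        for i, pc in enumerate(pattern, 1):
            cur = [i]
            for j, wc in enumerate(window, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (pc != wc)))
            prev = cur
        return min(prev) <= k

    def approximate_search(text, pattern, k):
        """Start positions of text windows verified to contain `pattern` with <= k errors."""
        m = len(pattern)
        piece_len = m // (k + 1)
        # Pigeonhole: with <= k errors over k+1 pieces, at least one piece matches exactly.
        pieces = [(p * piece_len,
                   pattern[p * piece_len:(p + 1) * piece_len] if p < k else pattern[p * piece_len:])
                  for p in range(k + 1)]
        hits = set()
        for offset, piece in pieces:
            pos = text.find(piece)
            while pos != -1:
                lo = max(0, pos - offset - k)            # window that must contain the match
                hi = min(len(text), pos - offset + m + k)
                if occurs_within_k(text[lo:hi], pattern, k):
                    hits.add(lo)
                pos = text.find(piece, pos + 1)
        return sorted(hits)

    print(approximate_search("the quick brown fox jumps", "brwn", 1))   # [9, 10]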

Article
Graph Compression by BFS
by Alberto Apostolico and Guido Drovandi
Algorithms 2009, 2(3), 1031-1044; https://doi.org/10.3390/a2031031 - 25 Aug 2009
Cited by 78 | Viewed by 15226
Abstract
The Web Graph is a large-scale graph that does not fit in main memory, so lossless compression methods have been proposed for it. This paper introduces a compression scheme that combines efficient storage with fast retrieval of the information in a node. The scheme exploits the properties of the Web Graph without assuming an ordering of the URLs, so that it may be applied to more general graphs. Tests on some datasets in common use achieve space savings of about 10% over existing methods.
(This article belongs to the Special Issue Data Compression)
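
As a rough sketch of why a breadth-first traversal helps (this is not the paper's actual encoding): relabelling nodes in BFS order tends to give each node successors with nearby identifiers, so sorted adjacency lists can be stored as small gaps. The Python sketch below shows only that relabel-and-gap-encode step on a tiny, hypothetical graph.

    # BFS relabelling plus gap (delta) encoding of adjacency lists.
    # Hypothetical sketch of the general idea only; the paper's scheme also
    # exploits the similarity of consecutive lists and needs no URL ordering.
    from collections import deque

    def bfs_relabel(adj, root):
        """Map node ids to consecutive integers in BFS visiting order."""
        order = {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in order:
                    order[v] = len(order)
                    queue.append(v)
        for u in adj:                     # nodes unreachable from the root
            order.setdefault(u, len(order))
        return order

    def gap_encode(adj, order):
        """Renumber each adjacency list and store its sorted successors as gaps."""
        encoded = {}
        for u, succs in adj.items():
            ids = sorted(order[v] for v in succs)
            if not ids:
                encoded[order[u]] = []
                continue
            gaps = [ids[0] - order[u]] + [b - a for a, b in zip(ids, ids[1:])]
            encoded[order[u]] = gaps
        return encoded

    adj = {0: [2, 5], 2: [0, 7], 5: [7, 2], 7: [5]}     # tiny made-up graph
    order = bfs_relabel(adj, root=0)
    print(order)                    # {0: 0, 2: 1, 5: 2, 7: 3}
    print(gap_encode(adj, order))   # {0: [1, 1], 1: [-1, 3], 2: [-1, 2], 3: [-1]}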
